This work details a highly efficient implementation of the 3D scale-invariant feature transform (SIFT) algorithm, for the purpose of machine learning from large sets of volumetric image data. The primary operations of the 3D SIFT code are implemented on a graphics processing unit (GPU), including convolution, sub-sampling, and 4D peak detection from scale-space pyramids. Performance improvements are quantified using 3D MRI human brain volumes of different subjects. Computationally efficient 3D keypoint descriptors are proposed based on the Binary Robust Independent Elementary Features (BRIEF) code, including a novel descriptor we call Ranked Robust Independent Elementary Features (RRIEF), and compared to the original 3D SIFT-Rank method \citep{toews2013efficient}. The GPU implementation affords a speedup of approximately 7x beyond an optimized CPU implementation, where computation time is reduced from 33 seconds to 0.2 seconds for a 3D volume of (145, 174, 145) voxels with approximately 3000 keypoints. Notable speedups include the convolution operation (20x), 4D peak detection (3x), sub-sampling (3x), and Gaussian pyramid construction (2x). The efficient descriptors offer a 2x speedup and a 6x memory saving compared to standard SIFT-Rank descriptors, at the cost of reduced keypoint correspondences, revealing a trade-off between computational efficiency and algorithmic performance. The speedups gained by our implementation will allow for more efficient analysis of larger data sets. Our optimized GPU implementation of the 3D SIFT-Rank extractor is available at https://github.com/carluerjb/3d_sift_cuda.
translated by 谷歌翻译
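The pipeline the abstract summarizes (Gaussian scale space, difference-of-Gaussians, 4D peak detection) can be sketched on CPU with NumPy/SciPy. This is a minimal illustration with made-up sigma and contrast-threshold values, not the authors' CUDA implementation:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def dog_pyramid(vol, sigmas=(1.0, 1.6, 2.56, 4.1)):
    """Blur the volume at increasing scales and difference adjacent levels."""
    blurred = [gaussian_filter(vol.astype(np.float32), s) for s in sigmas]
    return np.stack([b1 - b0 for b0, b1 in zip(blurred, blurred[1:])])

def peaks_4d(dog, thresh=0.02):
    """A voxel is a keypoint candidate if its |DoG| response is the maximum
    of its 3x3x3x3 neighbourhood in (scale, x, y, z) and exceeds a
    contrast threshold."""
    mag = np.abs(dog)
    local_max = maximum_filter(mag, size=3) == mag
    return np.argwhere(local_max & (mag > thresh))
```

An impulse in the volume yields a single extremum at the finest DoG level, centred on the impulse. The GPU work described above accelerates exactly these convolution and neighbourhood-comparison steps.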
Many real-world applications of language models (LMs), such as code autocomplete and writing assistance, involve human-LM interaction, but the main LM benchmarks are non-interactive, where a system produces output without human intervention. To evaluate human-LM interaction, we develop a framework, Human-AI Language-based Interaction Evaluation (H-LINE), that expands non-interactive evaluation along three dimensions, capturing (i) the interactive process, not only the final output; (ii) the first-person subjective experience, not just a third-party assessment; and (iii) notions of preference beyond quality. We then design five tasks ranging from goal-oriented to open-ended to capture different forms of interaction. On four state-of-the-art LMs (three variants of OpenAI's GPT-3 and AI21's J1-Jumbo), we find that non-interactive performance does not always result in better human-LM interaction and that first-person and third-party metrics can diverge, suggesting the importance of examining the nuances of human-LM interaction.
State-of-the-art brain tumor segmentation is based on deep learning models applied to multi-modal MRIs. Currently, these models are trained on images after a preprocessing stage that involves registration, interpolation, brain extraction (BE, also known as skull-stripping) and manual correction by an expert. However, for clinical practice, this last step is tedious and time-consuming and, therefore, not always feasible, resulting in skull-stripping faults that can negatively impact the tumor segmentation quality. Still, the extent of this impact has never been measured for any of the many different BE methods available. In this work, we propose an automatic brain tumor segmentation pipeline and evaluate its performance with multiple BE methods. Our experiments show that the choice of a BE method can compromise up to 15.7% of the tumor segmentation performance. Moreover, we propose training and testing tumor segmentation models on non-skull-stripped images, effectively discarding the BE step from the pipeline. Our results show that this approach leads to a competitive performance at a fraction of the time. We conclude that, in contrast to the current paradigm, training tumor segmentation models on non-skull-stripped images can be the best option when high performance in clinical practice is desired.
With the rise of AI in recent years and the increase in complexity of the models, the growing demand for computational resources is starting to pose a significant challenge. The need for higher compute power is being met with increasingly potent accelerators and the use of large compute clusters. However, the gain in prediction accuracy from large models trained on distributed and accelerated systems comes at the price of a substantial increase in energy demand, and researchers have started questioning the environmental friendliness of such AI methods at scale. Consequently, energy efficiency plays an important role for AI model developers and infrastructure operators alike. The energy consumption of AI workloads depends on the model implementation and the utilized hardware. Therefore, accurate measurements of the power draw of AI workflows on different types of compute nodes are key to algorithmic improvements and the design of future compute clusters and hardware. To this end, we present measurements of the energy consumption of two typical applications of deep learning models on different types of compute nodes. Our results indicate that (1) deriving energy consumption directly from runtime is not accurate; instead, the composition of the compute node needs to be taken into account; (2) neglecting accelerator hardware on mixed nodes results in disproportionate inefficiency regarding energy consumption; and (3) the energy consumption of model training and inference should be considered separately: while training on GPUs outperforms all other node types regarding both runtime and energy consumption, inference on CPU nodes can be comparably efficient. One advantage of our approach is that the information on energy consumption is available to all users of the supercomputer, enabling an easy transfer to other workloads alongside a rise in user awareness of energy consumption.
translated by 谷歌翻译
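The first finding above, that runtime alone is a poor proxy for energy, can be made concrete: integrating a measured power trace over time differs from multiplying runtime by a nominal rating whenever the draw varies across phases of the workload. The trace values below are invented purely for illustration of the bookkeeping, not taken from the paper's measurements:

```python
def energy_from_trace(t, p):
    """Trapezoidal integral of a power trace: times in s, power in W -> joules."""
    return sum(0.5 * (p[i] + p[i + 1]) * (t[i + 1] - t[i])
               for i in range(len(t) - 1))

t = [0.0, 1.0, 2.0, 3.0, 4.0]         # sample times (s)
p = [50.0, 200.0, 200.0, 80.0, 50.0]  # measured node draw (W), phase-dependent
actual = energy_from_trace(t, p)      # integrates the actual draw
naive = (t[-1] - t[0]) * max(p)       # runtime x peak rating overestimates
```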
Due to the complicated nanoscale structures of current integrated circuits(IC) builds and low error tolerance of IC image segmentation tasks, most existing automated IC image segmentation approaches require human experts for visual inspection to ensure correctness, which is one of the major bottlenecks in large-scale industrial applications. In this paper, we present the first data-driven automatic error detection approach targeting two types of IC segmentation errors: wire errors and via errors. On an IC image dataset collected from real industry, we demonstrate that, by adapting existing CNN-based approaches of image classification and image translation with additional pre-processing and post-processing techniques, we are able to achieve recall/precision of 0.92/0.93 in wire error detection and 0.96/0.90 in via error detection, respectively.
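The recall/precision figures quoted above follow the standard counts-based definitions. The counts in the usage below are hypothetical, chosen only to roughly reproduce the wire-error numbers:

```python
def precision_recall(tp, fp, fn):
    """precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)
```

For example, 92 true positives with 7 false positives and 8 false negatives gives precision of about 0.93 and recall of 0.92.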
Computational approaches are beginning to be used to design dynamic visual identities fuelled by data and generative processes. In this work, we explore these computational approaches in order to generate visual identities that create customised lettering and images. We implement a generative design system that automatically assembles black-and-white visual modules. The system generates designs employing two main methods: (i) assisted generation; and (ii) automatic generation. The assisted generation method produces outputs wherein the placement of the modules is determined by a previously defined configuration file. The automatic generation method, on the other hand, produces outputs wherein the modules are assembled so as to depict an input image. The system speeds up the process of designing and deploying one visual identity and generates visual coherence among the outputs. In this paper, we comprehensively describe the system and its results.
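The automatic-generation method described above, assembling modules to depict an input image, can be sketched as follows. The three black-and-white modules are illustrative stand-ins for the system's actual module set: each image cell is replaced by the module whose ink coverage best matches the cell's mean brightness.

```python
import numpy as np

def make_modules(size=8):
    """Three toy modules: solid black, half black, and all white."""
    solid = np.zeros((size, size))
    empty = np.ones((size, size))
    half = np.ones((size, size)); half[:, :size // 2] = 0
    return [solid, half, empty]

def assemble(image, modules):
    """Tile the image, picking per cell the module of closest mean value."""
    s = modules[0].shape[0]
    h, w = (image.shape[0] // s) * s, (image.shape[1] // s) * s
    means = [m.mean() for m in modules]
    out = np.empty((h, w))
    for i in range(0, h, s):
        for j in range(0, w, s):
            cell_mean = image[i:i + s, j:j + s].mean()
            k = int(np.argmin([abs(cell_mean - m) for m in means]))
            out[i:i + s, j:j + s] = modules[k]
    return out
```

A dark region of the input is rendered with black modules, a bright region with white ones, so the assembled composition reads as the input image from a distance.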
We study the problem of graph structure identification, i.e., recovering the graph of dependencies among time series. We model the time series data as components of the state of a linear stochastic networked dynamical system. We assume partial observability, where the state evolution of only a subset of the nodes comprising the network is observed. We devise a new feature vector computed from the observed time series and prove that these features are linearly separable, i.e., there exists a hyperplane that separates the cluster of features associated with connected pairs of nodes from the cluster associated with disconnected pairs. This renders the features amenable to training a variety of classifiers for causal inference. In particular, we use these features to train convolutional neural networks (CNNs). The resulting causal inference mechanism outperforms state-of-the-art counterparts w.r.t. sample complexity. The trained CNNs generalize well over structurally distinct networks (dense or sparse) and noise-level profiles. Remarkably, they also generalize well to real-world networks while trained over a synthetic network (a realization of a random graph). Finally, the proposed method consistently reconstructs the graph in a pairwise manner, that is, by deciding whether an edge or arrow exists or not between each pair of nodes, from the corresponding time series of each pair. This fits the framework of large-scale systems, where observation or processing of all nodes in the network is prohibitive.
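A toy sketch of the pairwise-feature idea above: simulate a linear networked dynamical system, then build, for every node pair (i, j), a feature vector of lagged empirical cross-covariances of the observed time series. The least-squares hyperplane here is an illustrative stand-in for the CNN classifier, and all dynamics parameters are invented:

```python
import numpy as np

def simulate(A, T=5000, seed=0):
    """x_{t+1} = A x_t + noise; returns the (T, n) state trajectory."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    x = np.zeros(n)
    X = np.empty((T, n))
    for t in range(T):
        x = A @ x + rng.standard_normal(n)
        X[t] = x
    return X

def pair_features(X, lags=(0, 1, 2, 3)):
    """Feature vector for pair (i, j): empirical lag-k cross-covariances."""
    T, n = X.shape
    Xc = X - X.mean(0)
    return np.stack([(Xc[k:].T @ Xc[:T - k]) / (T - k) for k in lags], -1)

def fit_hyperplane(feats, adj):
    """Least-squares separating direction over all off-diagonal pairs."""
    n = adj.shape[0]
    mask = ~np.eye(n, dtype=bool)
    y = adj[mask].astype(float) * 2 - 1   # +1 connected, -1 disconnected
    w, *_ = np.linalg.lstsq(feats[mask], y, rcond=None)
    return w
```

Because the decision is made pair by pair, the same trained separator can be applied to each pair of observed nodes without ever processing the full network, as the abstract notes.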
The presence of outliers (anomalous values) in synthetic aperture radar (SAR) data, as well as misspecification in statistical image models, may result in inaccurate inferences. To avoid such issues, a Rayleigh regression model based on a robust estimation process is proposed as a more realistic approach to model this type of data. This paper aims at obtaining Rayleigh regression model parameter estimators that are robust to the presence of outliers. The proposed approach considers the weighted maximum likelihood method and is submitted to numerical experiments using simulated and measured SAR images. Monte Carlo simulations are employed to evaluate the performance of the proposed robust estimators for finite signal lengths, their sensitivity to outliers, and their breakdown point. For instance, the non-robust estimators show relative bias values 65-fold larger than the results provided by the robust approach on corrupted signals. In terms of sensitivity analysis and breakdown point, the robust scheme resulted in reductions of about $96\%$ and $10\%$, respectively, in the mean absolute value of both measures, in comparison with the non-robust estimators. Moreover, the ground-type and anomaly detection results of the proposed robust scheme are compared with competing methods in the literature using two SAR data sets.
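The weighted maximum likelihood idea can be illustrated for the simplest case of a single Rayleigh scale parameter: observations that are implausible under the current fit are down-weighted, so a few outliers cannot drag the estimate. The weight rule and tuning constant below are a simple illustrative choice, not the paper's exact scheme:

```python
import numpy as np

def rayleigh_wmle(x, c=3.0, iters=20):
    """Iteratively reweighted estimate of the Rayleigh scale sigma.
    Observations with standardized amplitude r > c get weight (c/r)^2,
    capping their contribution to the weighted likelihood equations."""
    x = np.asarray(x, float)
    sigma2 = np.mean(x ** 2) / 2.0              # classical MLE as a start
    for _ in range(iters):
        r = x / np.sqrt(sigma2)                 # standardized amplitude
        w = np.where(r <= c, 1.0, (c / r) ** 2) # down-weight outliers
        sigma2 = np.sum(w * x ** 2) / (2.0 * np.sum(w))
    return np.sqrt(sigma2)
```

On a contaminated sample, the classical estimate $\hat\sigma = \sqrt{\overline{x^2}/2}$ is inflated by the outliers, while the reweighted estimate stays near the true scale.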
Robots are expected to grasp a wide range of objects varying in shape, weight, or material type. Providing robots with tactile capabilities similar to those of humans is thus essential for applications involving human-to-robot or robot-to-robot interactions, particularly in situations where a robot is expected to grasp and manipulate complex objects not previously encountered. A key aspect of successful object grasping and manipulation is the use of high-quality fingertips equipped with multiple high-performance sensors, distributed appropriately across a specific contact surface. In this paper, we present a detailed analysis of the use of two different types of commercially available robotic fingertips (BioTac and WTS-FT), each equipped with multiple sensors distributed across the fingertip's contact surface. We further demonstrate that, owing to the high performance of the fingertips, a complex adaptive grasping algorithm is not required for grasping everyday objects. We conclude that a simple algorithm based on a proportional controller suffices, provided the fingertips involved exhibit high sensitivity. In a quantified assessment, we also demonstrate that, due in part to its sensor distribution, the BioTac-based fingertip outperforms the WTS-FT device, enabling loads of up to 850 g to be lifted, and that the simple proportional controller can adapt the grasp even when the object is subjected to significant external vibrational challenges.
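The simple proportional controller mentioned above can be sketched in a few lines; the target pressure, gain, and command limits are illustrative values, not those used in the paper:

```python
def p_controller_step(pressure, command, target=0.6, kp=0.5,
                      cmd_min=0.0, cmd_max=1.0):
    """One control step: move the grip command proportionally to the
    error between the target and the measured contact pressure,
    clamped to the actuator's command range."""
    error = target - pressure
    command = command + kp * error
    return min(cmd_max, max(cmd_min, command))
```

With an idealized plant in which the measured pressure simply tracks the command, repeatedly applying this step converges to the target contact pressure; on real hardware, the fingertip sensors close this loop.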
Modern life is driven by electronic devices connected to the internet. The emerging research field of the Internet of Things (IoT) has become popular as the number of connected devices has steadily increased, now exceeding 50 billion. Since many of these devices are used to perform computer vision (CV) tasks, it is essential to understand their power consumption in relation to their performance. We report the power consumption profile and analysis of the NVIDIA Jetson Nano board while performing object classification. The authors present an extensive analysis of the power consumption per frame and the frames-per-second (FPS) output using YOLOv5 models. The results show that YOLOv5n outperforms the other YOLOv5 variants in terms of both throughput (i.e., 12.34 fps) and low power consumption (i.e., 0.154 mWh/frame).
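The mWh-per-frame figure combines average board power with throughput: energy per frame is average power divided by FPS, converted from joules to milliwatt-hours. The 6.84 W input in the usage below is not a reported measurement; it is back-derived from the quoted 12.34 fps and 0.154 mWh/frame, purely as a consistency check on the unit conversion:

```python
def energy_per_frame_mwh(avg_power_w, fps):
    """Average power (W) / frames-per-second gives joules per frame;
    divide by 3600 J/Wh and scale by 1000 to obtain mWh per frame."""
    joules_per_frame = avg_power_w / fps
    return joules_per_frame / 3600.0 * 1000.0
```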